17 research outputs found

    Perceptual Image Similarity Metrics and Applications.

    Full text link
    This dissertation presents research in perceptual image similarity metrics and applications, e.g., content-based image retrieval, perceptual image compression, image similarity assessment and texture analysis. The first part aims to design texture similarity metrics consistent with human perception. A new family of statistical texture similarity features, called Local Radius Index (LRI), and corresponding similarity metrics are proposed. Compared to state-of-the-art metrics in the STSIM family, LRI-based metrics achieve better texture retrieval performance with much less computation. When applied to the recently developed perceptual image coder, Matched Texture Coding (MTC), they enable similar performance while significantly accelerating encoding. Additionally, in photographic paper classification, LRI-based metrics also outperform pre-existing metrics. To meet the needs of texture classification and other applications, a rotation-invariant version of LRI, called Rotation-Invariant Local Radius Index (RI-LRI), is proposed. RI-LRI is also insensitive to grayscale and illuminance changes. The corresponding similarity metric achieves texture classification accuracy comparable to state-of-the-art metrics, while its much lower-dimensional feature vector requires substantially less computation and storage than other state-of-the-art texture features.
    The second part of the dissertation focuses on bilevel images, whose pixels are either black or white. The contributions include new objective similarity metrics intended to quantify similarity consistent with human perception, and a subjective experiment to obtain ground truth for judging the performance of objective metrics. Several similarity metrics are proposed that outperform existing ones in the sense of attaining significantly higher Pearson and Spearman-rank correlations with the ground truth. The new metrics include Adjusted Percentage Error, Bilevel Gradient Histogram, Connected Components Comparison and combinations of these.
    Another portion of the dissertation focuses on the aforementioned MTC, a block-based image coder that uses texture similarity metrics to decide whether blocks of the image can be encoded by pointing to perceptually similar blocks in the already coded region. The key to its success is an effective texture similarity metric, such as an LRI-based metric, and an effective search strategy. Compared to traditional image compression algorithms, e.g., JPEG, MTC achieves a similar coding rate with higher reconstruction quality, and its advantage grows as the coding rate decreases.
    PhD
    Electrical Engineering: Systems
    University of Michigan, Horace H. Rackham School of Graduate Studies
    http://deepblue.lib.umich.edu/bitstream/2027.42/113586/1/yhzhai_1.pd
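    The Connected Components Comparison named above rests on a simple structural feature of a bilevel image: how many connected regions it contains. As a minimal illustration only (the actual metric is defined in the dissertation and papers, not reproduced here), the following sketch counts 4-connected foreground components with a flood fill:

```python
# Counting 4-connected foreground components in a 0/1 bilevel image.
# Illustrative sketch; not the Connected Components Comparison metric itself.
import numpy as np
from collections import deque

def count_components(img: np.ndarray, foreground: int = 1) -> int:
    """Count 4-connected components of `foreground`-valued pixels."""
    h, w = img.shape
    seen = np.zeros((h, w), dtype=bool)
    count = 0
    for r in range(h):
        for c in range(w):
            if img[r, c] == foreground and not seen[r, c]:
                count += 1                      # found a new component
                q = deque([(r, c)])
                seen[r, c] = True
                while q:                        # flood-fill its pixels
                    y, x = q.popleft()
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and img[ny, nx] == foreground
                                and not seen[ny, nx]):
                            seen[ny, nx] = True
                            q.append((ny, nx))
    return count

a = np.array([[1, 1, 0],
              [0, 0, 0],
              [0, 1, 1]])
print(count_components(a))  # -> 2
```

    A comparison metric could then contrast such counts (and component shapes or sizes) between an original and a distorted image; the papers define the precise form.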

    OBJECTIVE SIMILARITY METRICS FOR SCENIC BILEVEL IMAGES

    Full text link
    This paper proposes new objective similarity metrics for scenic bilevel images, which are images containing natural scenes such as landscapes and portraits. Though percentage error is the most commonly used similarity metric for bilevel images, it is not always consistent with human perception. Based on hypotheses about human perception of bilevel images, this paper proposes new metrics that outperform percentage error in the sense of attaining significantly higher Pearson and Spearman-rank correlation coefficients with respect to subjective ratings. The new metrics include Adjusted Percentage Error, Bilevel Gradient Histogram and Connected Components Comparison. The subjective ratings come from similarity evaluations described in a companion paper. Combinations of these metrics are also proposed, which exploit their complementarity to attain even better performance.
    Peer Reviewed
    http://deepblue.lib.umich.edu/bitstream/2027.42/111058/4/OBJECTIVE SIMILARITY METRICS FOR SCENIC BILEVEL IMAGES.pd

    Similarity of Scenic Bilevel Images

    Full text link
    This paper was submitted to IEEE Transactions on Image Processing in May 2015. It presents a study of bilevel image similarity, including new objective metrics intended to quantify similarity consistent with human perception, and a subjective experiment to obtain ground truth for judging the performance of the objective similarity metrics. The focus is on scenic bilevel images, which are complex, natural or hand-drawn images, such as landscapes or portraits. The ground truth was obtained from ratings by 77 subjects of 44 distorted versions of seven scenic images, using a modified version of the SDSCE testing methodology. Based on hypotheses about human perception of bilevel images, several new metrics are proposed that outperform existing ones in the sense of attaining significantly higher Pearson and Spearman-rank correlation coefficients with respect to the ground truth from the subjective experiment. The new metrics include Adjusted Percentage Error, Bilevel Gradient Histogram and Connected Components Comparison. Combinations of these metrics are also proposed, which exploit their complementarity to attain even better performance. These metrics and the ground truth are then used to assess the relative severity of various kinds of distortion and the performance of several lossy bilevel compression methods.
    http://deepblue.lib.umich.edu/bitstream/2027.42/111737/2/Similarity of Scenic Bilevel Images.pdf
    Description of Similarity of Scenic Bilevel Images.pdf : Main article ("Similarity of Scenic Bilevel Images")
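    The Pearson and Spearman-rank correlations used to score the metrics can be computed directly: Pearson correlates raw metric scores with mean subjective ratings, and Spearman is Pearson applied to the ranks of each. A numpy-only sketch, with the score and rating values being hypothetical placeholders rather than data from the paper:

```python
# Judging an objective metric: correlate its scores with subjective ratings.
# The example scores/ratings are invented for illustration.
import numpy as np

def pearson(x, y) -> float:
    x, y = np.asarray(x, float), np.asarray(y, float)
    return float(np.corrcoef(x, y)[0, 1])

def _ranks(x) -> np.ndarray:
    """1-based ranks, averaging ranks over ties."""
    x = np.asarray(x, float)
    order = np.argsort(x)
    ranks = np.empty(len(x))
    ranks[order] = np.arange(1, len(x) + 1)
    for v in np.unique(x):          # average the ranks of tied values
        mask = x == v
        ranks[mask] = ranks[mask].mean()
    return ranks

def spearman(x, y) -> float:
    """Spearman rank correlation = Pearson correlation of the ranks."""
    return pearson(_ranks(x), _ranks(y))

scores  = [0.10, 0.25, 0.40, 0.80]  # hypothetical metric outputs (distortion)
ratings = [4.5, 3.9, 3.1, 1.2]      # hypothetical mean subjective ratings
# Spearman is exactly -1 for any monotonically decreasing relation.
print(pearson(scores, ratings), spearman(scores, ratings))
```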

    Towards Generic Image Manipulation Detection with Weakly-Supervised Self-Consistency Learning

    Full text link
    As advanced image manipulation techniques emerge, detecting manipulation becomes increasingly important. Despite the success of recent learning-based approaches to image manipulation detection, they typically require expensive pixel-level annotations for training, and their performance degrades when tested on images manipulated differently from the training images. To address these limitations, we propose weakly-supervised image manipulation detection, in which only binary image-level labels (authentic or tampered with) are required for training. Such a weakly-supervised setting can leverage more training images and has the potential to adapt quickly to new manipulation techniques. To improve generalization, we propose weakly-supervised self-consistency learning (WSCL) to leverage the weakly annotated images. Specifically, two consistency properties are learned: multi-source consistency (MSC) and inter-patch consistency (IPC). MSC exploits different content-agnostic information and enables cross-source learning via an online pseudo-label generation and refinement process. IPC performs global pair-wise patch-to-patch relationship reasoning to discover complete regions of manipulation. Extensive experiments validate that our WSCL, even though it is weakly supervised, exhibits competitive performance compared with its fully-supervised counterpart under both in-distribution and out-of-distribution evaluations, as well as reasonable manipulation localization ability.
    Comment: Accepted to ICCV 2023, code: https://github.com/yhZhai/WSC

    SOAR: Scene-debiasing Open-set Action Recognition

    Full text link
    Deep learning models risk utilizing spurious clues to make predictions, such as recognizing actions based on the background scene. This issue can severely degrade open-set action recognition performance when the testing samples have scene distributions different from those of the training samples. To mitigate this problem, we propose a novel method, called Scene-debiasing Open-set Action Recognition (SOAR), which features an adversarial scene reconstruction module and an adaptive adversarial scene classification module. The former prevents the decoder from reconstructing the video background given video features, and thus helps reduce the background information in feature learning. The latter aims to confuse scene-type classification given video features, with a specific emphasis on the action foreground, and helps to learn scene-invariant information. In addition, we design an experiment to quantify the scene bias. The results indicate that current open-set action recognizers are biased toward the scene, and that our proposed SOAR method better mitigates this bias. Furthermore, our extensive experiments demonstrate that our method outperforms state-of-the-art methods, and ablation studies confirm the effectiveness of our proposed modules.
    Comment: Accepted to ICCV 2023, code: https://github.com/yhZhai/SOA

    Scenic bilevel image similarity metrics MATLAB code

    Full text link
    This item contains MATLAB code for the scenic bilevel image similarity metrics described in the following two papers: (1) Y. Zhai and D.L. Neuhoff, Similarity of Scenic Bilevel Images, to appear in IEEE Transactions on Image Processing, 2016. (2) Y. Zhai, D.L. Neuhoff and T.N. Pappas, Objective Similarity Metrics for Scenic Bilevel Images, IEEE Intl. Conf. on Acoustics, Speech, and Signal Processing (ICASSP), pp. 2793-2797, Florence, Italy, May 2014.
    http://deepblue.lib.umich.edu/bitstream/2027.42/122736/1/Scenic bilevel image similarity metrics MATLAB code.zip
    Description of Scenic bilevel image similarity metrics MATLAB code.zip : MATLAB cod

    SUBJECTIVE SIMILARITY EVALUATION FOR SCENIC BILEVEL IMAGES

    Full text link
    In order to provide ground truth for subjectively comparing compression methods for scenic bilevel images, as well as for judging objective similarity metrics, this paper describes the subjective similarity rating of a collection of distorted scenic bilevel images. Unlike text, line drawings, and silhouettes, scenic bilevel images contain natural scenes, e.g., landscapes and portraits. Seven scenic images were each distorted in forty-four ways, including random bit flipping, dilation, erosion and lossy compression. To produce subjective similarity ratings, the distorted images were each viewed by 77 subjects. These ratings are then used to compare the performance of four compression algorithms and to assess how well percentage error and SmSIM work as bilevel image similarity metrics. The ratings can also provide ground truth for future tests of objective bilevel image similarity metrics.
    Peer Reviewed
    http://deepblue.lib.umich.edu/bitstream/2027.42/111057/4/SUBJECTIVE SIMILARITY EVALUATION FOR SCENIC BILEVEL IMAGES.pd
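    Of the distortions listed, random bit flipping is the simplest to reproduce: each pixel is inverted independently with some probability p. A sketch under that reading; the flip probabilities and seeds used in the actual experiment are not reproduced here:

```python
# Random bit flipping on a 0/1 bilevel image: each pixel is inverted
# independently with probability p. Illustrative parameters only.
import numpy as np

def flip_bits(img: np.ndarray, p: float, seed: int = 0) -> np.ndarray:
    """Return a copy of `img` with each pixel flipped with probability p."""
    rng = np.random.default_rng(seed)
    mask = rng.random(img.shape) < p     # True where a pixel gets flipped
    return np.where(mask, 1 - img, img)

img = np.zeros((64, 64), dtype=int)      # all-white (0) test image
noisy = flip_bits(img, p=0.1)
print(noisy.mean())  # roughly 0.1 of the pixels are now 1
```

    Feeding such distorted images, together with their originals, to an objective metric and correlating the metric's outputs with the subjective ratings is how the metrics in the companion papers are evaluated.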

    Bilevel Image Similarity Ground Truth Archive

    Full text link
    The data in this file is intended for scholarly, non-commercial use only. The images cannot be re-distributed. Copyright to the data is retained by Yuanhao Zhai and David L. Neuhoff. If issues arise, please contact [email protected] or [email protected].
    This archive contains a set of seven bilevel images, the same images distorted in a number of ways and to a number of different degrees, subjective rating scores for each distorted image as to its similarity to its corresponding original, and the amounts of time required for the rating of each image.
    http://deepblue.lib.umich.edu/bitstream/2027.42/111059/3/Bilevel Image Similarity Ground Truth Archive.zi